Clique counting in MapReduce: theory and experiments

Authors

  • Irene Finocchi
  • Marco Finocchi
  • Emanuele G. Fusco
Abstract

We present exact and approximate MapReduce estimators for the number of cliques of size k in an undirected graph, for any small constant k ≥ 3. Besides theoretically analyzing our algorithms in the computational model for MapReduce introduced by Karloff, Suri, and Vassilvitskii, we present the results of extensive computational experiments on the Amazon EC2 platform. Our experiments show the practical effectiveness of our algorithms even on clusters of small/medium size, and suggest their scalability to larger clusters.
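
To make the counting task concrete, the following is a minimal single-machine sketch of exact k-clique counting arranged as map, shuffle, and reduce steps. It is an illustration only and is not taken from the paper: the function name `count_k_cliques`, the degree-based node ordering, and the brute-force check inside the reduce step are all assumptions made for this example. Each clique is counted once, at its lowest-ranked node.

```python
from collections import defaultdict
from itertools import combinations


def count_k_cliques(edges, k):
    """Count k-cliques (k >= 3) with a simulated map/shuffle/reduce pass.

    Hypothetical helper for illustration; not the authors' algorithm.
    """
    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Total order on nodes: by degree, ties broken by node id.
    def rank(u):
        return (len(adj[u]), u)

    # "Map" + "shuffle": send every edge to its lower-ranked endpoint,
    # so each key u collects its higher-ranked neighbours.
    groups = defaultdict(set)
    for u in list(adj):
        for v in adj[u]:
            if rank(u) < rank(v):
                groups[u].add(v)

    # "Reduce": for each key u, count the (k-1)-subsets of its
    # higher-ranked neighbours that are pairwise adjacent.
    total = 0
    for u, higher in groups.items():
        for subset in combinations(higher, k - 1):
            if all(b in adj[a] for a, b in combinations(subset, 2)):
                total += 1
    return total
```

On the complete graph with four nodes, this sketch counts four triangles and one 4-clique. The brute-force reduce step is exponential in k, which is tolerable only for the small constant k the abstract has in mind.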

Related resources

Scalable Scientific Computing Algorithms Using MapReduce

Cloud computing systems, like MapReduce and Pregel, provide a scalable and fault tolerant environment for running computations at massive scale. However, these systems are designed primarily for data intensive computational tasks, while a large class of problems in scientific computing and business analytics are computationally intensive (i.e., they require a lot of CPU in addition to I/O). In ...

Mining maximal cliques from a large graph using MapReduce: Tackling highly uneven subproblem sizes

We consider Maximal Clique Enumeration (MCE) from a large graph. A maximal clique is perhaps the most fundamental dense substructure in a graph, and MCE is an important tool to discover densely connected subgraphs, with numerous applications to data mining on web graphs, social networks, and biological networks. While effective sequential methods for MCE are known, scalable parallel methods for...

Lessons from the Congested Clique Applied to MapReduce

The main results of this paper are (I) a simulation algorithm which, under quite general constraints, transforms algorithms running on the Congested Clique into algorithms running in the MapReduce model, and (II) a distributed O(∆)-coloring algorithm running on the Congested Clique which has an expected running time of O(1) rounds, if ∆ ≥ Θ(log n); and O(log log log n) rounds otherwise. Applyin...

Exploiting Problem Structure for Solution Counting

This paper deals with the challenging problem of counting the number of solutions of a CSP, denoted #CSP. Recent progress has been made using search methods, such as BTD [15], which exploit the constraint graph structure in order to solve CSPs. We propose to adapt BTD for solving the #CSP problem. The resulting exact counting method has a worst-case time complexity exponential in a specific gr...

Solution Counting for CSP and SAT with Large Tree-Width

This paper deals with the challenging problem of counting the number of solutions of a CSP, denoted #CSP. Recent progress has been made using search methods, such as Backtracking with Tree-Decomposition (BTD) [Jégou and Terrioux, 2003], which exploit the constraint graph structure in order to solve CSPs. We propose to adapt BTD for solving the #CSP problem. The resulting exact counting method h...

Publication date: 2014